Detection of regulator genes and eQTLs in gene networks
Genetic differences between individuals associated with quantitative phenotypic
traits, including disease states, are usually found in non-coding genomic
regions. These genetic variants are often also associated with differences in
expression levels of nearby genes (they are "expression quantitative trait
loci" or eQTLs for short) and presumably play a gene regulatory role, affecting
the status of molecular networks of interacting genes, proteins and
metabolites. Computational systems biology approaches to reconstruct causal
gene networks from large-scale omics data have therefore become essential to
understand the structure of networks controlled by eQTLs together with other
regulatory genes, and to generate detailed hypotheses about the molecular
mechanisms that lead from genotype to phenotype. Here we review the main
analytical methods and software to identify eQTLs and their associated genes,
to reconstruct co-expression networks and modules, to reconstruct causal
Bayesian gene and module networks, and to validate predicted networks in
silico.
Comment: minor revision with typos corrected; review article; 24 pages, 2
figures
Parent-of-origin-specific allelic associations among 106 genomic loci for age at menarche.
Age at menarche is a marker of timing of puberty in females. It varies widely between individuals, is a heritable trait and is associated with risks for obesity, type 2 diabetes, cardiovascular disease, breast cancer and all-cause mortality. Studies of rare human disorders of puberty and animal models point to a complex hypothalamic-pituitary-hormonal regulation, but the mechanisms that determine pubertal timing and underlie its links to disease risk remain unclear. Here, using genome-wide and custom-genotyping arrays in up to 182,416 women of European descent from 57 studies, we found robust evidence (P < 5 × 10⁻⁸) for 123 signals at 106 genomic loci associated with age at menarche. Many loci were associated with other pubertal traits in both sexes, and there was substantial overlap with genes implicated in body mass index and various diseases, including rare disorders of puberty. Menarche signals were enriched in imprinted regions, with three loci (DLK1-WDR25, MKRN3-MAGEL2 and KCNK9) demonstrating parent-of-origin-specific associations concordant with known parental expression patterns. Pathway analyses implicated nuclear hormone receptors, particularly retinoic acid and γ-aminobutyric acid-B2 receptor signalling, among novel mechanisms that regulate pubertal timing in humans. Our findings suggest a genetic architecture involving at least hundreds of common variants in the coordinated timing of the pubertal transition.
Functional transcription factor target discovery via compendia of binding and expression profiles
Genome-wide experiments to map the DNA-binding locations of
transcription-associated factors (TFs) have shown that the number of genes
bound by a TF far exceeds the number of possible direct target genes.
Distinguishing functional from non-functional binding is therefore a major
challenge in the study of transcriptional regulation. We hypothesized that
functional targets can be discovered by correlating binding and expression
profiles across multiple experimental conditions. To test this hypothesis, we
obtained ChIP-seq and RNA-seq data from matching cell types from the human
ENCODE resource, considered promoter-proximal and distal cumulative regulatory
models to map binding sites to genes, and used a combination of linear and
non-linear measures to correlate binding and expression data. We found that a
high degree of correlation between a gene's TF-binding and expression profiles
was significantly more predictive of the gene being differentially expressed
upon knockdown of that TF, compared to using binding sites in the cell type of
interest only. Remarkably, TF targets predicted from correlation across a
compendium of cell types were also predictive of functional targets in other
cell types. Finally, correlation across a time course of ChIP-seq and RNA-seq
experiments was also predictive of functional TF targets in that tissue.
Comment: 15 pages + 8 pages supplementary material; 6 figures, 6 supplementary
figures, 5 supplementary tables
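The idea of scoring a gene by how well its TF-binding profile tracks its expression profile across conditions can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the measures used, so Pearson correlation (linear) and Spearman rank correlation (monotone, non-linear) are assumed here, and the function names, the max-of-absolute-values scoring rule and the toy data are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed profiles."""
    return pearson(ranks(x), ranks(y))


def rank_targets(binding, expression):
    """Order genes by the stronger of the two correlations (assumed scoring rule).

    binding, expression: dicts mapping gene name -> per-condition profile,
    with conditions (e.g. cell types) in the same order in both dicts.
    """
    scores = {}
    for gene, b in binding.items():
        if gene in expression:
            e = expression[gene]
            scores[gene] = max(abs(pearson(b, e)), abs(spearman(b, e)))
    return sorted(scores, key=scores.get, reverse=True)


# Toy data (invented): binding signal and expression across five cell types.
binding = {"geneA": [1, 2, 3, 4, 5], "geneB": [1, 2, 3, 4, 5]}
expression = {"geneA": [2, 4, 6, 8, 10], "geneB": [3, 1, 4, 1, 5]}
candidate_order = rank_targets(binding, expression)
```

In this toy case `geneA`, whose expression rises with binding in every cell type, is ranked ahead of `geneB`. A real analysis would also need a rule for mapping binding sites to genes and a significance assessment, neither of which is sketched here.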
Efficient and accurate causal inference with hidden confounders from genome-transcriptome variation data
Mapping gene expression as a quantitative trait using whole genome-sequencing
and transcriptome analysis makes it possible to discover the functional
consequences of genetic variation. We developed a novel method and ultra-fast
software, Findr, for highly accurate causal inference between gene expression
traits using cis-regulatory DNA variations as causal anchors, which improves on
current methods by taking into account hidden confounders and weak regulations. Findr
outperformed existing methods on the DREAM5 Systems Genetics challenge and on
the prediction of microRNA and transcription factor targets in human
lymphoblastoid cells, while being nearly a million times faster. Findr is
publicly available at https://github.com/lingfeiwang/findr
Comment: New result and method sections added. 38 pages, 4 figures, 1 table.
Supplementary: 20 pages, 10 figures, 2 tables
Identification of active transcriptional regulatory elements from GRO-seq data
Modifications to the global run-on and sequencing (GRO-seq) protocol that enrich for 5'-capped RNAs can be used to reveal active transcriptional regulatory elements (TREs) with high accuracy. Here, we introduce discriminative regulatory-element detection from GRO-seq (dREG), a sensitive machine learning method that uses support vector regression to identify active TREs from GRO-seq data without requiring cap-based enrichment (https://github.com/Danko-Lab/dREG/). This approach allows TREs to be assayed together with gene expression levels and other transcriptional features in a single experiment. Predicted TREs are more enriched for several marks of transcriptional activation, including expression quantitative trait loci, disease-associated polymorphisms, acetylated histone 3 lysine 27 (H3K27ac) and transcription factor binding, than those identified by alternative functional assays. Using dREG, we surveyed TREs in eight human cell types and provide new insights into global patterns of TRE function.
Biological function in the twilight zone of sequence conservation
Strong DNA conservation among divergent species is an indicator of enduring functionality. With weaker sequence conservation we enter a vast ‘twilight zone’ in which sequence subject to transient or lower constraint cannot be distinguished easily from neutrally evolving, non-functional sequence. Twilight zone functional sequence is illuminated instead by principles of selective constraint and positive selection using genomic data acquired from within a species’ population. Application of these principles reveals that despite being biochemically active, most twilight zone sequence is not functional.